Reducing Software Overheads in Parallel Linear Algebra Libraries

Author

  • Peter Strazdins
Abstract

Software overheads can be a significant cause of performance degradation in parallel numerical libraries. This paper examines the nature and extent of software overheads in an implementation of parallel LAPACK on distributed memory multiprocessors, where block-partitioned algorithms with a general block-cyclic matrix distribution scheme present special challenges. It then describes various techniques that have been used to reduce these overheads, and evaluates their effectiveness. While there is a tradeoff between the software engineering properties of high data and procedural abstraction, modularity and portability (which are particularly important in parallel programming) and achieving low software overheads, it is shown that a good balance can be achieved in the case of parallel LAPACK, at least for important classes of computations.


Similar resources

A Scalable Parallel 2D Wavelet Transform Algorithm

TR-CS-97-15. Peter Strazdins. Reducing software overheads in parallel linear algebra libraries. July 1997. Abstract. We present a new parallel 2D wavelet transform algorithm with minimal communication requirements. Data are transmitted between nearest neighbors only and the amount is independent...


Exposing Inner Kernels and Block Storage for Fast Parallel Dense Linear Algebra Codes

Efficient execution on processors with multiple cores requires the exploitation of parallelism within the processor. For many dense linear algebra codes this, in turn, requires the efficient execution of codes which operate on relatively small matrices. Efficient implementations of dense Basic Linear Algebra Subroutines exist (BLAS libraries). However, calls to BLAS libraries introduce large ov...


Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems

We present a review of the current best practices in parallel programming models for dense linear algebra (DLA) on heterogeneous architectures. We consider multicore CPUs, stand alone manycore coprocessors, GPUs, and combinations of these. Of interest is the evolution of the programming models for DLA libraries – in particular, the evolution from the popular LAPACK and ScaLAPACK libraries to th...


Software Libraries for Linear Algebra Computations on High Performance Computers

This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under develo...


Software Libraries for Linear Algebra Computations on High Performance Computers (technical paper accepted for publication in SIAM Review)

This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under develo...




Publication date: 1997